UFRGS@CLEF2009: Retrieval by Numbers

Authors

  • Thyago Bohrer Borges
  • Viviane Pereira Moreira
Abstract

For UFRGS’s participation in CLEF’s Robust task, our aim was to compare retrieval over plain documents with retrieval that uses information on word senses. The experimental run, which used word-sense disambiguation (WSD), indexed the synset codes of the senses whose scores exceeded a predefined threshold. The documents in both the baseline and WSD runs were indexed with Zettair, and queries were matched against documents using Okapi BM25. The results show that the strategy helped only 47 topics, while 103 performed worse. A statistical t-test showed that the run without WSD information significantly outperformed the run with it. A deeper analysis of our results and a set of further experiments are in preparation.
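The two steps named in the abstract, keeping only the synset codes whose WSD score passes a threshold and then ranking with Okapi BM25, can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the input layout, the function names `synset_tokens` and `bm25_scores`, the `syn-0001`-style codes, the threshold of 0.5, and the parameters `k1 = 1.2` and `b = 0.75` are all assumptions, and the BM25 variant shown may differ from the one Zettair implements.

```python
import math
from collections import Counter


def synset_tokens(annotated_doc, threshold=0.5):
    """Return the synset codes whose WSD score is above the threshold.

    annotated_doc is a hypothetical layout: a list of
    (word, [(synset_code, score), ...]) pairs.
    """
    tokens = []
    for _word, senses in annotated_doc:
        tokens.extend(code for code, score in senses if score > threshold)
    return tokens


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenised document against the query with a BM25 variant.

    k1 and b are commonly used defaults, assumed here for illustration.
    """
    if not docs:
        return []
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # The +1 inside the log keeps the IDF non-negative (a common variant).
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical annotated documents with made-up synset codes and scores.
doc_a = [("bank", [("syn-0001", 0.8), ("syn-0002", 0.2)]),
         ("river", [("syn-0003", 0.9)])]
doc_b = [("bank", [("syn-0002", 0.7)]),
         ("loan", [("syn-0004", 0.6)])]

index = [synset_tokens(doc_a), synset_tokens(doc_b)]
print(bm25_scores(["syn-0001"], index))
```

In this sketch a query only matches a document if both were disambiguated to the same synset code, so any WSD error on either side removes a match that plain word indexing would have kept. That is one plausible reading of why a sense-based run could underperform the baseline; the abstract itself does not give a cause.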


Similar resources

BBK-UFRGS@CLEF2009: Query Expansion of Geographic Place Names

For our first participation in CLEF, our aim was to compare plain information retrieval strategies with query expansion and emphasis of geographic terms. ANNIE was used to recognise geographic entities, which were expanded using Google's Hierarchical List of Geographical Place Names. The idea was that the expansion would produce more accurate answers. The results have shown the opposite. Our best...


Evaluation of Perstem: A Simple and Efficient Stemming Algorithm for Persian

Persian is a challenging language in the field of NLP. Right-to-left orthography, complex morphology, complicated grammatical rules, and different forms of letters make it an interesting language for NLP research. In this paper we measure the effectiveness of a simple and efficient stemming algorithm, Perstem, on Persian information retrieval. Our experiments on the Hamshahri corpus at CLEF2009 ...


Uma abordagem para a Busca Contextual de Documentos na Internet

This work presents an approach for building automatic tools that help users search for documents on the Internet. The contextual use of words is analyzed in order to eliminate ambiguities and to discover topics of interest to users. Through successive search refinements and user feedback, the set of documents converges to the most relevant ones. The main contribution of this work is t...


Using Text Surrounding Method to Enhance Retrieval of Online Images by Google Search Engine

Purpose: The current research aimed to compare the effectiveness of various tags and codes for retrieving images from Google. Design/methodology: Selected images with different characteristics in a registered domain were carefully studied. The exception was that special conceptual features have been apportioned for each group of images separately. In this regard, each group image surr...


UFRGS@CLEF2008: Indexing Multiword Expressions for Information Retrieval

For UFRGS’s participation in CLEF’s Robust task, our aim was to assess the benefits of identifying and indexing Multiword Expressions (MWEs) for Information Retrieval. The approach used for MWE identification was purely statistical, based on association measures such as Mutual Information and Chi-square. Contradicting our results on the training topics, the results on the test topics did not show...


Publication date: 2009